We introduce a new nameless representation of lambda terms inspired by ordered logic. At a lambda
abstraction, number and relative position of all occurrences of the bound variable are stored, and
application carries the additional information where to cut the variable context into function and
argument part. This way, complete information about free variable occurrence is available at each
subterm without requiring a traversal, and environments can be kept exact such that they only assign
values to variables that actually occur in the associated term. Our approach avoids space leaks in
interpreters that build function closures.

In this article, we prove correctness of the new representation and present an experimental eval- 
uation of its performance in a proof checker for the Edinburgh Logical Framework. 

Keywords: representation of binders, explicit substitutions, ordered contexts, space leaks, Logical
Framework.



1 Introduction 

Type checking dependent types in languages like Agda [21] and Coq [14] or logical frameworks like
Twelf [22] requires a large amount of evaluation, since types may depend on values. Such type checkers
incorporate an interpreter for purely functional programs with free variables (at least, the λ-calculus),
which is used to compute weak head normal forms of types. Efficiency of type checking is mostly
identical with efficiency of evaluation¹ (and, in case of type reconstruction, efficiency of unification),
and remains a challenge as of today. In seminal work, Grégoire and Leroy [13] have sped up Coq type
checking by compiling to byte-code instead of using an interpreter. Boespflug [6] has obtained further
speed-ups by producing native code using stock compilers.

While compilation approaches are successful on batch type checking fully explicit programs, they 
have not been attempted on type reconstruction using higher-order unification or on interactive program 
construction such as in Agda and Epigram [18]. These languages are involved and constantly evolving,
and their implementations are prototypes and frequently modified and extended. Implementing a
full compiler just to get type reconstruction going is deterring; furthermore, compilation has not (yet) 
proven its feasibility in minor evaluation tasks (like weak head evaluation) that dominate higher-order 
unification. At least for language prototyping, smart interpreters are, and may remain, competitive with 
compilation. 

For instance, Twelf's interpreter is sufficiently fast; it is inspired by a term representation with de
Bruijn indices [7] and explicit substitutions [1]. In the context of functional programming, explicit
substitutions are known as closures, consisting of the code of a function plus an environment, assigning
values to the free variables appearing in the code. In typical implementations of interpreters [9], these
environments are not precise; they assign values to all variables that are statically in scope rather than
only to those that are actually referred to in the code. This bears potential for space leaks: the
environment of a closure might refer to a large value that is never used, but cannot be garbage collected. An
obvious remedy to this threat is, when forming a closure, to restrict the environment to the actual free
variables; however, this requires a traversal of the code. We explore a different direction: we are looking
for a code representation that maintains information about the free variables at each node of the abstract
syntax tree.

¹ Evaluation is necessary to reduce types to weak head normal form and compare types for equality. Subtracting these
operations, type checking has linear complexity.

A principal candidate is linear typing in Curry-Howard correspondence with Girard's linear logic
[12]; there, each variable in scope is actually referenced (more precisely, referenced exactly once). In
other words, the free variables are exactly the variables in the typing context. Dropping types, we may
talk of linear scoping. Yet we do not want to represent linear terms, but arbitrary λ-terms. Kesner
and Lengrand [15] achieve this by introducing explicit term constructs for weakening and contraction.
We pursue a different path: we incorporate information about variable use and multiplicity directly into
abstraction and application.

In the context of linear λ-calculus, the free variables of a function application are the disjoint union
of the free variables of the function and the free variables of the argument. If we want to maintain the 
set of free variables during a term traversal, at an application node we need to decide which variables go 
into the function part and which into the argument part. Thus, we would store at each application a set of 
variables that go into the, say, function part, all others would go to the argument part. Less information 
is needed if we switch to an ordered representation. 

Ordered logic, also called non-commutative linear logic [23], refines linear logic by removing the
structural rule exchange, which restricts hypotheses to be used in the order they have been declared.
Transferred to ordered scoping, this principle means that the scoping context lists the free variables
in the order they occur in the term, from left to right. This allows pushing the context into an application 
with very little information: we just need to know how many variables appear in the function part so we 
can cut the context in two at the right position, splitting it into function context and argument context. 
This constitutes the central idea of our representation: at each application node of the syntax tree, we 
store a number denoting the number of free variable occurrences in the function part. During evaluation 
of an application in an environment, we can cut the environment into two, the environment needed for 
the evaluation of the function and the environment needed for the evaluation of the argument. Thus, our 
environments are precise and space leaks are avoided. In particular, a variable is always evaluated in a 
singleton environment assigning only a value to this variable. Following this observation, variables do 
not need a name, they are identified by their position; and environments are simply sequences of values. 

Since we are not interested in proof terms of ordered logic per se, but only borrow the ordered 
context idea for our representation of untyped λ-calculus, we need to allow multiple occurrences of the
same variable. In fact, the context shall list the variable occurrences in order. At a lambda abstraction, 
we bind all occurrences of the same variable. Thus, at an abstraction node we specify at which positions 
the bound variable should be inserted in the scoping context. This concludes the presentation of our idea. 

In the rest of the paper, after an introductory example (Section 2), we formally define our term
representation in Section 3. Interpreter and handling of environments are described in Section 4, followed
by the translation between ordinary lambda terms and ordered terms (Section 5). Soundness of the
interpreter is formally proven in Section 6, before we conclude with an experimental evaluation in Section 7.

This article summarizes the B.Sc. thesis of the second author [16].

2 An Example

To demonstrate the discussed risk of space leaks during evaluation, we apply the term $\lambda x.\lambda y.\,a\,b\,y$ in
basic syntax consecutively to the free variables $g$ and $f$. A possible (and, if the mentioned closures are
used, typical) sequence of reduction steps is given below. By writing $t[s_1/x_1, \dots, s_n/x_n]$, we want to
express that in the term $t$, each occurrence of the variable $x_1$ ($x_2, \dots, x_n$) has to be replaced by the term
$s_1$ (resp. $s_2, \dots, s_n$) simultaneously. Such a substitution list always applies only to the directly preceding
term:

$$\begin{array}{cl}
 & (\lambda x.\lambda y.\,a\,b\,y)\ g\ f \\
\longrightarrow & (\lambda y.\,a\,b\,y)[g/x]\ f \\
\longrightarrow & (a\,b\,y)[g/x, f/y] \\
\longrightarrow & (a\,b)[g/x, f/y]\ \ y[g/x, f/y] \\
\longrightarrow & a[g/x, f/y]\ \ b[g/x, f/y]\ \ f \\
\longrightarrow & a\,b\,f
\end{array}$$

Here, the substitution $[g/x]$ could be dropped instantly and there is no need to apply the other substitution
$[f/y]$ to the term $a\,b$. However, the term representation used above comes along with the problem that
such an evaluation algorithm does not have the required information in time. This is due to the fact that
the binding information is always split between the $\lambda$ and the actual variable occurrence, as they both
carry the variable name. In contrast, using de Bruijn indices would make it possible to remove the piece
of information from the $\lambda$. Our goal is to do it the other way round: We want the whole information
to be available at the $\lambda$, thus making it possible to know the number and places of the bound variable
occurrences without looking at the whole term.



3 Syntax 

In this article, we only cover the core constructs of the lambda calculus as they are enough to make 
the approach clear. However, we do not see any limitations for common extensions. We first define 
ordered preterms:

$$\begin{array}{rcll}
\text{ordered preterms} \ni t,u & ::= & x & \text{free variable (named } x\text{)} \\
 & \mid & \bullet & \text{bound variable (nameless)} \\
 & \mid & t\,{}^{m}\,u & \text{application} \\
 & \mid & \lambda^{\vec{k}}.\,t & \text{abstraction}
\end{array}$$

Free variables are denoted by their name like in the standard syntax. Bound variables, however, are just
denoted by a dot $\bullet$, which does not carry any information beside the fact that it is a bound variable. In
the case of an application, there is a first term (the function part) and a second term (the argument) as
usual. Furthermore, the application carries an integer $m$ as an additional piece of information that will
be important for the evaluation process and is explained in a moment. The most interesting part is the
abstraction $\lambda^{\vec{k}}.\,t$. The vector $\vec{k} = [k_1, k_2, \dots, k_n]$ is nothing else but a list of non-negative integers of length
$n$. It determines which dots $\bullet$ are bound by the $\lambda$ in the following way: Consider all $\bullet$ in the term $t$ which
are not bound in $t$ itself. Now, the first $k_1$ of these are not bound by the $\lambda$, the next one is, the following
$k_2$ are again not bound and so on (see examples below).

We denote the number of unbound $\bullet$ in an ordered preterm $t$ by $\mathrm{fV}(t)$. Consequently, $\mathrm{fV}(\cdot)$ is simply
defined by:

$$\mathrm{fV}(x) = 0, \quad \mathrm{fV}(\bullet) = 1, \quad \mathrm{fV}(t\,{}^{m}\,u) = \mathrm{fV}(t) + \mathrm{fV}(u), \quad \mathrm{fV}(\lambda^{[k_1,\dots,k_n]}.\,t) = \mathrm{fV}(t) - n.$$

We call an ordered preterm $u$ an ordered term iff each sub-preterm $w$ of $u$ fulfils the following condition:
If $w$ is an application $t\,{}^{m}\,t'$, the equation $m = \mathrm{fV}(t)$ holds, and if it is an abstraction $\lambda^{[k_1,\dots,k_n]}.\,t$, then
$n + \sum_{i=1}^{n} k_i \le \mathrm{fV}(t)$ is satisfied. The latter condition states that if a $\lambda$ in $u$ binds a variable, this variable must
actually exist, while the first one just gives a meaning to the integer carried by an application. Clearly,
any sub-preterm of an ordered term is again an ordered term. $u$ is called closed if $\mathrm{fV}(u) = 0$.
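Here are some examples of closed terms. The S combinator

$$\lambda x.\lambda y.\lambda z.\,x\,z\,(y\,z)$$

would be written as

$$\lambda^{[0]}.\lambda^{[1]}.\lambda^{[1,1]}.\ \bullet\,{}^{1}\bullet\,{}^{2}(\bullet\,{}^{1}\bullet).$$

Moreover, the term

$$(\lambda x.\lambda y.\,a\,b\,y)\ g\ f$$

from Section 2 would be represented as

$$(\lambda^{[]}.\lambda^{[0]}.\ a\,{}^{0}b\,{}^{0}\bullet)\,{}^{0}g\,{}^{0}f$$

(note that applications are still left-associative). We can see that the first $\lambda$ does not bind anything as it
is annotated with the empty vector $[]$, while this is less obvious when it is written as $\lambda x$.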

At this point, we hope to have clarified the intended meaning of our syntax. A formal definition will
be given in Section 5.
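
The syntax and the well-formedness condition can be sketched in Haskell as follows. This is only an illustration under our own naming: the constructor names `Free`, `Dot`, `App`, `Lam` and the functions `fv` and `isOrdered` are assumptions and are not taken from the implementation evaluated in Section 7. The check recomputes `fv` at every node, so it is quadratic in the worst case; it is meant to mirror the definition, not to be efficient.

```haskell
-- Ordered preterms, following the grammar of Section 3.
data Term
  = Free String        -- free variable, carries its name
  | Dot                -- nameless bound variable occurrence
  | App Int Term Term  -- application t ^m u, m = number of unbound dots in t
  | Lam [Int] Term     -- abstraction with the vector [k1, ..., kn]
  deriving (Eq, Show)

-- fV: the number of unbound dots of a preterm.
fv :: Term -> Int
fv (Free _)    = 0
fv Dot         = 1
fv (App _ t u) = fv t + fv u
fv (Lam ks t)  = fv t - length ks

-- A preterm is an ordered term iff every application index equals fV of
-- its function part and every abstraction satisfies n + sum k_i <= fV(t).
isOrdered :: Term -> Bool
isOrdered (Free _)    = True
isOrdered Dot         = True
isOrdered (App m t u) = m == fv t && isOrdered t && isOrdered u
isOrdered (Lam ks t)  =
  all (>= 0) ks && length ks + sum ks <= fv t && isOrdered t

-- The S combinator from the examples above.
sComb :: Term
sComb = Lam [0] (Lam [1] (Lam [1, 1] (App 2 (App 1 Dot Dot) (App 1 Dot Dot))))
```

For `sComb`, the body has four unbound dots; the innermost abstraction binds two of them, and each outer abstraction binds one more, so the whole term is closed.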

4 Values and Evaluation 

Before specifying values and evaluation formally, we want to give an example to demonstrate how the 
information carried by a lambda should be used and why we always have exactly the needed information. 
Suppose we have the term

$$(\lambda^{[0]}.\lambda^{[1]}.\lambda^{[1,1]}.\ \bullet\,{}^{1}\bullet\,{}^{2}(\bullet\,{}^{1}\bullet))\ g\ f\ n,$$

that is, the S combinator applied to three free variables (we suppress the indices of the outer applications
for better readability). We want to get rid of the beta redexes, so we start by eliminating the first one. The outermost
$\lambda$ is decorated with the vector $[0]$ of length one. Now, the single variable bound by this $\lambda$ should be
replaced by $g$, so we start a substitution list and insert a single $g$:

$$(\lambda^{[1]}.\lambda^{[1,1]}.\ \bullet\,{}^{1}\bullet\,{}^{2}(\bullet\,{}^{1}\bullet))[g]\ f\ n.$$

The first remaining $\lambda$ is $\lambda^{[1]}$, so it does not bind the first variable (thus $g$ remains first in the substitution
list), but the second one. Consequently, we add an $f$ after the $g$:

$$(\lambda^{[1,1]}.\ \bullet\,{}^{1}\bullet\,{}^{2}(\bullet\,{}^{1}\bullet))[g,f]\ n.$$

Now the situation becomes more interesting. The only remaining $\lambda$ is now decorated with the vector
$[1,1]$, so one $n$ has to be inserted after the first entry ($g$) and another one must be placed after the
subsequent entry ($f$):

$$(\bullet\,{}^{1}\bullet\,{}^{2}(\bullet\,{}^{1}\bullet))[g,n,f,n].$$

All $\lambda$ have now been eliminated. The applications' indices tell us how the substitution list should be
divided between the terms:

$$(\bullet\,{}^{1}\bullet)[g,n]\ \ (\bullet\,{}^{1}\bullet)[f,n].$$

We do the same step once more:

$$\bullet[g]\ \ \bullet[n]\ \ (\bullet[f]\ \ \bullet[n]).$$

The only thing left to be done is to apply the substitutions in the obvious way:

$$g\,n\,(f\,n).$$

Here, evaluation naturally leads to a term in beta normal form. This is not always the case: as an
example, if we had tried to evaluate the above term without the $n$ (i.e. the S combinator applied only to
$g$ and $f$), we would have got stuck at $(\lambda^{[1,1]}.\ \bullet\,{}^{1}\bullet\,{}^{2}(\bullet\,{}^{1}\bullet))[g,f]$. However, this would have been satisfactory as it would have shown that the
term's normal form is an abstraction. In other words, our evaluation results in weak head normal forms.
Consequently, we define values in the following way:

$$\begin{array}{rcll}
\text{values} \ni v,w & ::= & x\,\vec{v} & \text{large application} \\
 & \mid & (\lambda^{\vec{k}}.\,t)^{\vec{v}} & \text{closure}
\end{array}$$

The large application² consists of a variable $x$ which is applied to a vector $\vec{v} = [v_1, v_2, \dots, v_m]$ of values. It is
to be read as a left-associative application, i.e. as $((x\,v_1)\,v_2 \dots)\,v_m$. Note that it is not necessarily "large".
Quite the contrary, it often only consists of the head (and the vector of values is empty).

A closure $(\lambda^{\vec{k}}.\,t)^{\vec{v}}$ is the result of the evaluation process if the corresponding beta normal form of the
term does not start with a free variable. The main part, $\lambda^{\vec{k}}.\,t$, is nothing other than a lambda abstraction in
the syntax of ordered terms. In addition, we need the substitution list $\vec{v}$ (which is simply a list of values)
that satisfies $\mathrm{length}(\vec{v}) = \mathrm{fV}(\lambda^{\vec{k}}.\,t)$. The idea is that the $i$-th unbound $\bullet$ is to be replaced by $v_i$. These
substitution lists have already been used in the example above.

At this point, we want to introduce a notation for inserting a single item multiple times into a
list. More precisely, if $\vec{v} = [v_1, v_2, \dots, v_m]$ is a list, $\vec{k} = [k_1, k_2, \dots, k_n]$ is a vector (i.e. also a list) of
non-negative integers satisfying $\sum_{j=1}^{n} k_j \le m$ and $w$ is a single item, we write $\vec{v}^{\,w/\vec{k}}$ for the list that
is constructed by inserting $w$ at each of the positions $k_1, k_1 + k_2, \dots, \sum_{j=1}^{n} k_j$ of $\vec{v}$, i.e. the list
$[v_1, v_2, \dots, v_{k_1}, w, v_{k_1+1}, \dots, v_{k_1+k_2}, w, v_{k_1+k_2+1}, \dots, v_m]$ (of course, it is possible that $\vec{v}^{\,w/\vec{k}}$ starts or ends
with $w$).
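
On plain Haskell lists, the multi-insertion just defined can be sketched as a short recursive function. The name `multiInsert` is our own choice for this illustration; the argument order (item, vector of gaps, list) is likewise an assumption.

```haskell
-- multiInsert w ks vs inserts w into vs once for every entry of ks:
-- the first k1 elements of vs are copied, then w is emitted, then the
-- next k2 elements are copied, then w again, and so on. Whatever is
-- left of vs after the last insertion is appended unchanged.
multiInsert :: a -> [Int] -> [a] -> [a]
multiInsert _ []       vs = vs
multiInsert w (k : ks) vs = front ++ w : multiInsert w ks rest
  where
    (front, rest) = splitAt k vs
```

For instance, `multiInsert 'n' [1,1] "gf"` yields `"gnfn"`, which is the list [g, n, f, n] from the example at the beginning of this section.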

We are now able to define the evaluation function $[\![\cdot]\!]_{\cdot}$, which takes an ordered term $t$ as well as
an ordered substitution list $\vec{v}$ and returns a value. The tuple must always satisfy the condition $\mathrm{fV}(t) =
\mathrm{length}(\vec{v})$. In other words, the list carries neither too little nor redundant information. At the start of the
evaluation, the ordered substitution list is empty. Additionally, we specify the application $\cdot\,@\,\cdot$ of two
values, which also returns a value and does not need anything else. Our evaluation procedure uses a
call-by-value strategy:

$$\begin{array}{rcll}
[\![x]\!]_{[]} & = & x & (1) \\
[\![\bullet]\!]_{[v_1]} & = & v_1 & (2) \\
[\![t\,{}^{m}\,u]\!]_{[v_1,\dots,v_n]} & = & [\![t]\!]_{[v_1,\dots,v_m]} \mathbin{@} [\![u]\!]_{[v_{m+1},\dots,v_n]} & (3) \\
[\![\lambda^{\vec{k}}.\,t]\!]_{\vec{v}} & = & (\lambda^{\vec{k}}.\,t)^{\vec{v}} & (4) \\
(x\,\vec{v}) \mathbin{@} w & = & x\,[\vec{v},w] & (5) \\
(\lambda^{\vec{k}}.\,t)^{\vec{v}} \mathbin{@} w & = & [\![t]\!]_{\vec{v}^{\,w/\vec{k}}} & (6)
\end{array}$$

² The argument vector $\vec{v}$ of a large application is sometimes called a spine. Large applications $x\,\vec{v}$ also appear in the
formulation of Böhm trees [4].



First, if we want to evaluate a free variable (1), the substitution list must be empty because of the invariant 
mentioned above. Second, in the case of a • (2), the ordered list must have exactly one entry. This entry is 
the result of the evaluation. If we evaluate an application (3), we evaluate the left and the right term. The 
application's index enables us to split the substitution list at the right position. Then, we have to apply 
the first result to the second. Evaluating an abstraction (4) is easy. We just need to keep the substitutions 
to build a closure. 

If we want to apply a large application to a value $w$ (5), we just append $w$ to the vector of values (we
write $[\vec{v},w]$ for $[v_1, \dots, v_n, w]$). The case of a closure $(\lambda^{\vec{k}}.\,t)^{\vec{v}}$ (6) is less simple, but it is still quite clear
what to do: $\vec{k}$ determines at which positions $w$ should be inserted in the ordered substitution list, so we
just construct the list $\vec{v}^{\,w/\vec{k}}$. Then, $t$ is evaluated.
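
The six rules translate almost literally into a small Haskell interpreter. The sketch below uses plain lists for substitution lists and our own names (`Neutral` for large applications, `Closure`, `eval`, `apply`, `multiInsert`); none of these identifiers are taken from the implementation discussed in Section 7. Note also that Haskell is lazy, so unlike the call-by-value strategy stated in the text, this sketch evaluates arguments only on demand unless strictness is added.

```haskell
-- Ordered terms (Section 3).
data Term = Free String | Dot | App Int Term Term | Lam [Int] Term
  deriving (Eq, Show)

-- Values: large applications and closures.
data Value
  = Neutral String [Value]      -- x applied to a (possibly empty) spine
  | Closure [Int] Term [Value]  -- abstraction together with its substitution list
  deriving (Eq, Show)

-- Multi-insertion of w at the positions given by the vector ks.
multiInsert :: a -> [Int] -> [a] -> [a]
multiInsert _ []       vs = vs
multiInsert w (k : ks) vs = front ++ w : multiInsert w ks rest
  where (front, rest) = splitAt k vs

-- Evaluation; the caller must ensure fV(t) == length of the list.
eval :: Term -> [Value] -> Value
eval (Free x)    []  = Neutral x []                        -- rule (1)
eval Dot         [v] = v                                   -- rule (2)
eval (App m t u) vs  = apply (eval t front) (eval u rest)  -- rule (3)
  where (front, rest) = splitAt m vs
eval (Lam ks t)  vs  = Closure ks t vs                     -- rule (4)
eval _           _   = error "invariant fV(t) == length vs violated"

-- Application of two values.
apply :: Value -> Value -> Value
apply (Neutral x vs)    w = Neutral x (vs ++ [w])          -- rule (5)
apply (Closure ks t vs) w = eval t (multiInsert w ks vs)   -- rule (6)
```

Rule (3) is where the representation pays off: the index `m` stored at the application is all that is needed to hand each subterm exactly the values it uses.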

Concerning substitution lists, we talk about "lists of values" for simplicity. More specifically, we
want them to be lists of pointers to avoid the duplication of "real" values during constructing lists
like $\vec{v}^{\,w/\vec{k}}$. Instead of simple linked lists, we have also implemented these lists as dynamic functional
arrays represented as binary trees. This reduces the asymptotic costs of list splitting (from $[v_1, \dots, v_n]$ to
$([v_1, \dots, v_k], [v_{k+1}, \dots, v_n])$) and multi-insertion $\vec{v}^{\,w/\vec{k}}$ from linear to logarithmic time (in terms of the
length of the list $\vec{v}$). For an experimental comparison of the two implementations see Section 7.

5 Parsing and Printing 

In this section, we define how terms in normal syntax are translated into our ordered syntax (Parsing)
and vice versa (Printing). To specify this, we need some notation. First of all, we write $X$ for the set of
variable names we want to use and $T$ for the set of lambda terms in basic standard syntax (i.e. $X \subseteq T$;
furthermore, $x \in X$ together with $t,u \in T$ implies $t\,u \in T$ and $\lambda x.t \in T$). Additionally, $\mathrm{oT}$ is the set of
terms in our ordered syntax defined above, $\mathrm{oT}_0$ the subset of closed ordered terms ($\mathrm{fV}(t) = 0$) and $V$ the
set of values (defined in the previous section). Moreover, we write $\vec{X}$ for the set of lists of variable names
and $\vec{V}$ for the set of lists of values. For each set $\Gamma$ of variable names ($\Gamma \subseteq X$), we denote the set of lists
of elements of $\Gamma$ by $\vec{\Gamma}$ and the ordered terms that do not contain any variable of $\Gamma$ (as a free variable) by
$\mathrm{oT}_\Gamma$.

By writing $\mathrm{oT} \otimes \vec{X}$ (resp. $\mathrm{oT} \otimes \vec{V}$, $\mathrm{oT}_\Gamma \otimes \vec{\Gamma}$, ...), we mean the subset $\{(t,\vec{x}) \mid \mathrm{fV}(t) = \mathrm{length}(\vec{x})\}$ of
$\mathrm{oT} \times \vec{X}$ (and analogously for the other cases).

For a finite set $\Gamma$ of variable names, we define the correspondence relation $\cdot \lhd\Gamma\rhd \cdot\,[\cdot] \subseteq T \times \mathrm{oT} \times \vec{X}$
(pronounce: "corresponds in context $\Gamma$ to"). The intuition is that $M \lhd\Gamma\rhd u[\vec{x}]$ means: $M$ is a term that
corresponds to the ordered term $u$, where unbound $\bullet$ are replaced by the (not necessarily pairwise distinct)
variables in the list $\vec{x}$. The set $\Gamma$ can be seen as a filter that tells us which free variables do not occur in $u$
but in $\vec{x}$ instead.



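$$\frac{x \in \Gamma}{x \lhd\Gamma\rhd \bullet[x]}
\qquad
\frac{M \lhd\Gamma\rhd t[\vec{x}] \qquad N \lhd\Gamma\rhd u[\vec{y}] \qquad \mathrm{length}(\vec{x}) = m}{M\,N \lhd\Gamma\rhd (t\,{}^{m}\,u)[\vec{x},\vec{y}]}$$

$$\frac{x \notin \Gamma}{x \lhd\Gamma\rhd x[]}
\qquad
\frac{M \lhd(\Gamma \cup \{z\})\rhd t[\vec{x}^{\,z/\vec{k}}]}{\lambda z.M \lhd\Gamma\rhd (\lambda^{\vec{k}}.\,t)[\vec{x}]}$$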

It is important to note that $M \lhd\Gamma\rhd u[\vec{x}]$ implies that each variable occurring in $\vec{x}$ is contained in $\Gamma$ and
each free variable occurring in $u$ is not contained in $\Gamma$. This can be shown by induction on $M$
(simultaneously for all sets $\Gamma$).

By the same argument, one can see that (for each $M$ and $\Gamma$) there exists a unique tuple $(u,\vec{x})$ satisfying
$M \lhd\Gamma\rhd u[\vec{x}]$, so we can consider $(\lhd\Gamma\rhd)$ a function $T \to \mathrm{oT}_\Gamma \times \vec{\Gamma}$. Furthermore, we note that $M \lhd\Gamma\rhd u[\vec{x}]$
always implies $\mathrm{fV}(u) = \mathrm{length}(\vec{x})$.

This also works the other way round. For each $\Gamma$ and each tuple $(u,\vec{x}) \in \mathrm{oT}_\Gamma \otimes \vec{\Gamma}$, there is (by induction
on $u$) a term $M \in T$ satisfying $M \lhd\Gamma\rhd u[\vec{x}]$. Moreover, this term $M$ is unique up to α-equivalence. So,
$(\lhd\Gamma\rhd)$ is actually a bijection between $T/\alpha$ (the set of α-equivalence classes of terms) and $\mathrm{oT}_\Gamma \otimes \vec{\Gamma}$.
The inference rules above show how to apply this bijection or its inverse to a term or a tuple (in the
last rule, any variable satisfying the condition can be chosen for $z$), so we have a computable bijection
$T/\alpha \leftrightarrow \mathrm{oT}_\Gamma \otimes \vec{\Gamma}$ determined by the rules for $(\lhd\Gamma\rhd)$.

Choosing $\Gamma = \emptyset$, we get a bijection $T/\alpha \leftrightarrow \mathrm{oT}_\emptyset \otimes \vec{\emptyset}$. As $\vec{\emptyset}$ is only inhabited by the empty vector $[]$, we
naturally get the parse function $\rhd$ which maps $T/\alpha$ bijectively onto $\mathrm{oT}_0$.
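
Read from the conclusion upwards, the four inference rules describe a recursive parsing procedure. The following sketch is one possible reading, not the authors' code: `Named`, `parse`, `extract` and `parseClosed` are names we introduce here, and the set Γ is represented by a plain list of variable names. At an abstraction, the body is parsed first; the positions of the bound variable in the resulting occurrence list then determine the vector of gaps.

```haskell
-- Lambda terms in basic standard syntax (the set T).
data Named = Var String | NApp Named Named | NLam String Named
  deriving (Eq, Show)

-- Ordered terms (Section 3).
data Term = Free String | Dot | App Int Term Term | Lam [Int] Term
  deriving (Eq, Show)

-- extract z xs removes every occurrence of z from xs and records, for
-- each removed occurrence, how many other entries preceded it since the
-- previous one. This inverts the multi-insertion of z.
extract :: Eq a => a -> [a] -> ([Int], [a])
extract z xs = case break (== z) xs of
  (front, [])       -> ([], front)
  (front, _ : more) ->
    let (ks, rest) = extract z more
    in  (length front : ks, front ++ rest)

-- parse gamma m returns the ordered term and the list of occurrences
-- of variables from gamma, in left-to-right order.
parse :: [String] -> Named -> (Term, [String])
parse gamma (Var x)
  | x `elem` gamma = (Dot, [x])
  | otherwise      = (Free x, [])
parse gamma (NApp m n) = (App (length xs) t u, xs ++ ys)
  where
    (t, xs) = parse gamma m
    (u, ys) = parse gamma n
parse gamma (NLam z m) = (Lam ks t, rest)
  where
    (t, xs)    = parse (z : gamma) m
    (ks, rest) = extract z xs

-- Parsing with the empty context: no dots remain unbound.
parseClosed :: Named -> Term
parseClosed = fst . parse []
```

In this sketch, a name that is rebound by an inner abstraction is captured by the innermost binder, because `extract` removes all of its occurrences from the list before the outer binder sees it.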

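The above construction also gives us a function $\mathrm{oT}_0 \to T$, but this is not enough. We want to transform
closures (elements of $\mathrm{oT} \otimes \vec{V}$) and values (elements of $V$) into basic terms $T$. Therefore, we define the
two print functions $\lhd : \mathrm{oT} \otimes \vec{V} \to T/\alpha$ and $\lhd : V \to T/\alpha$ simultaneously by recursion on the structure:

$$\begin{array}{rcll}
\lhd(x, []) & = & x & \text{(I)} \\
\lhd(\bullet, [v]) & = & \lhd(v) & \text{(II)} \\
\lhd(t\,{}^{m}\,u, \vec{v}) & = & \lhd(t, \vec{v}_{\mathrm{start}})\ \lhd(u, \vec{v}_{\mathrm{rest}}) & \text{(III)} \\
 & & \text{(split } \vec{v} \text{ at position } m \text{ to get } \vec{v}_{\mathrm{start}} \text{ and } \vec{v}_{\mathrm{rest}}\text{)} & \\
\lhd(\lambda^{\vec{k}}.\,t, \vec{v}) & = & \lambda z.\,\lhd(t, \vec{v}^{\,z/\vec{k}}) & \text{(IV)} \\
 & & \text{where } z \text{ is any variable that does not occur freely in } t \text{ or } \vec{v} & \\
\lhd(x\,v_1\,v_2 \dots v_n) & = & x\ \lhd(v_1)\ \lhd(v_2) \dots \lhd(v_n) & \text{(V)} \\
 & & \text{(a large application simply becomes an application of terms)} & \\
\lhd((\lambda^{\vec{k}}.\,t)^{\vec{v}}) & = & \lhd(\lambda^{\vec{k}}.\,t, \vec{v}) & \text{(VI)}
\end{array}$$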

First, note that the printing functions are well-defined (i.e., they always terminate). This is because
during evaluation of $\lhd(u, \vec{v})$, we may safely assume that $\lhd(t, \vec{w})$ is well-defined as long as $t$ is a strict
subterm of $u$ and each value $w'$ in $\vec{w}$ is either only a variable (so termination of $\lhd(w')$ is clear) or also in
$\vec{v}$. Similarly, during evaluation of $\lhd(x\,v_1 \dots v_n)$, we may assume that $\lhd(v_i)$ is defined for each $i$.

For all $t \in \mathrm{oT}_0$, $M \in T$, we have $\lhd(t, []) = M$ if and only if $M \lhd\emptyset\rhd t[]$ (which just means $\rhd M = t$)
as both judgements are defined identically in the case of closed ordered terms. This essentially (with
implicit use of the bijection $\mathrm{oT}_0 \leftrightarrow \mathrm{oT}_\emptyset \otimes \vec{\emptyset}$) means $\lhd \circ \rhd = \mathrm{id}_T$, i.e. the composition of parsing and
printing is the identity.
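
Rules (I) to (VI) can be sketched in Haskell as two mutually recursive functions. The names `printTerm`, `printValue`, `freshName` and `Neutral` are ours, and the fresh-variable scheme is an assumption made for this illustration: a binder printed at nesting depth d is called `_z<d>`, which is only safe if no variable of the input uses the `_z` prefix. The text merely requires that z does not occur freely in the term or the substitution list.

```haskell
-- Lambda terms in basic standard syntax (the set T).
data Named = Var String | NApp Named Named | NLam String Named
  deriving (Eq, Show)

-- Ordered terms and values (Sections 3 and 4).
data Term = Free String | Dot | App Int Term Term | Lam [Int] Term
  deriving (Eq, Show)

data Value = Neutral String [Value] | Closure [Int] Term [Value]
  deriving (Eq, Show)

multiInsert :: a -> [Int] -> [a] -> [a]
multiInsert _ []       vs = vs
multiInsert w (k : ks) vs = front ++ w : multiInsert w ks rest
  where (front, rest) = splitAt k vs

-- Hypothetical naming scheme for bound variables (see the lead-in).
freshName :: Int -> String
freshName d = "_z" ++ show d

-- Print an ordered term together with its substitution list.
-- The Int counts the binders already printed around the current position.
printTerm :: Int -> Term -> [Value] -> Named
printTerm _ (Free x)    []  = Var x                                    -- (I)
printTerm d Dot         [v] = printValue d v                           -- (II)
printTerm d (App m t u) vs  = NApp (printTerm d t front)               -- (III)
                                   (printTerm d u rest)
  where (front, rest) = splitAt m vs
printTerm d (Lam ks t)  vs  =                                          -- (IV)
  NLam z (printTerm (d + 1) t (multiInsert (Neutral z []) ks vs))
  where z = freshName d
printTerm _ _           _   = error "invariant fV(t) == length vs violated"

-- Print a value.
printValue :: Int -> Value -> Named
printValue d (Neutral x vs)    = foldl NApp (Var x) (map (printValue d) vs)  -- (V)
printValue d (Closure ks t vs) = printTerm d (Lam ks t) vs                   -- (VI)
```

Printing a closed ordered term with `printTerm 0 t []` gives back a named term that is α-equivalent to the one it was parsed from; the bound names themselves are regenerated.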
6 Correctness and Termination properties



We still have not shown that our evaluation algorithm given in Section 4 does not change the meaning
of terms. The combination of parsing, evaluating and printing should never result in a term that is not
beta equivalent to the original term. We also want to show a limited termination property. To keep our
argument simple, we just sketch the proofs and hope that the ideas become clear.

First, we attend to the correctness question. We need to convince ourselves that rewriting according
to the rules of the functions $[\![\cdot]\!]_{\cdot}$ and $@$ does not cause an error. By rewriting, we mean one step of the
normal or leftmost outermost evaluation. We have demonstrated this in the example at the beginning of
Section 4. Printing should result in a term that is β-equivalent to the term we get if we rewrite before
printing. This basically means that, for each evaluation rule on the left hand side of the following table,
we have to check that the equality on the right hand side holds:



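$$\begin{array}{rcl|rcll}
[\![x]\!]_{[]} & = & x & \lhd(x, []) & =_\beta & \lhd(x) & (1) \\
[\![\bullet]\!]_{[v]} & = & v & \lhd(\bullet, [v]) & =_\beta & \lhd(v) & (2) \\
[\![t\,{}^{m}\,u]\!]_{[\vec{v},\vec{w}]} & = & [\![t]\!]_{\vec{v}} \mathbin{@} [\![u]\!]_{\vec{w}} & \lhd(t\,{}^{m}\,u, [\vec{v},\vec{w}]) & =_\beta & \lhd(t, \vec{v})\ \lhd(u, \vec{w}) & (3) \\
[\![\lambda^{\vec{k}}.\,t]\!]_{\vec{v}} & = & (\lambda^{\vec{k}}.\,t)^{\vec{v}} & \lhd(\lambda^{\vec{k}}.\,t, \vec{v}) & =_\beta & \lhd((\lambda^{\vec{k}}.\,t)^{\vec{v}}) & (4) \\
(x\,\vec{v}) \mathbin{@} w & = & x\,[\vec{v},w] & \lhd(x\,\vec{v})\ \lhd(w) & =_\beta & \lhd(x\,[\vec{v},w]) & (5) \\
(\lambda^{\vec{k}}.\,t)^{\vec{v}} \mathbin{@} w & = & [\![t]\!]_{\vec{v}^{\,w/\vec{k}}} & \lhd((\lambda^{\vec{k}}.\,t)^{\vec{v}})\ \lhd(w) & =_\beta & \lhd(t, \vec{v}^{\,w/\vec{k}}) & (6)
\end{array}$$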


Note that "rewriting using the evaluation rules" results in expressions which are, more or less, a
mixture of elements of $\mathrm{oT} \otimes \vec{V}$ and $V$. To be precise, such an expression is either in $\mathrm{oT} \otimes \vec{V}$ or in $V$ or
a tuple (to read as simple application) of two expressions. The 1st, 2nd and 4th rule turn an element of
$\mathrm{oT} \otimes \vec{V}$ into a value, the 3rd turns it into a tuple of two such elements, and the last two rules turn a tuple of
two values into one value or element. Although we do not define it formally, it should be clear how those
expressions can be printed by using the printing functions for ordered terms and values (if an expression
is a tuple, just print the function part and the argument part separately). In fact, while the function $[\![\cdot]\!]_{\cdot}$ is
formulated as a big step evaluation, the rewriting process can be understood as the corresponding small
step (or one step) evaluation.

In the first five rows of the table, we only have to look at the definitions of the printing functions to
see that the printed terms are not only β-equivalent but also equal. The very last rule (6) requires closer
examination: By rule (IV), the term $\lhd((\lambda^{\vec{k}}.\,t)^{\vec{v}})$ is equal to $\lambda z.\,\lhd(t, \vec{v}^{\,z/\vec{k}})$ for a (sufficiently) fresh
variable $z$. Now, the definitions of the printing functions are "context free" in a way that guarantees that $z$
occurs exactly $\mathrm{length}(\vec{k})$ times (free) in $\lhd(t, \vec{v}^{\,z/\vec{k}})$. Furthermore, replacing those occurrences by $\lhd(w)$
results in the term $\lhd(t, \vec{v}^{\,w/\vec{k}})$. This means that, starting with $\lhd((\lambda^{\vec{k}}.\,t)^{\vec{v}})\ \lhd(w)$, we have to use exactly
one β-reduction step to get the term $\lhd(t, \vec{v}^{\,w/\vec{k}})$.

As we have already seen that the composition of parsing a term and printing it afterwards does not
change anything (up to α-equivalence), we can now conclude that parsing, evaluating (a finite number of
rewriting steps) and printing is equivalent to a number of β-reduction steps.

Now we discuss termination. Obviously, our evaluation function $[\![\cdot]\!]_{\cdot}$ does not always terminate as
some terms do not have a weak head normal form. However, $[\![\cdot]\!]_{\cdot}$ terminates whenever it is applied to
$(\rhd t)$ if the usual β-reduction is strongly normalizing on $t$. The main consequence of this is that evaluation
terminates for all well-typed terms. To prove this statement, assume that there is such a term $s_0 \in T$ so that
the evaluation of $t_0 := \rhd s_0$ does not terminate. Then, we get an infinite sequence $t_0, t_1, t_2, \dots$ where $t_{i+1}$ is
the result of rewriting (a subexpression of) $t_i$ using one of the evaluation rules. If we print $t_0, t_1, t_2, \dots$, we
get a sequence $s_0, s_1, s_2, \dots$ of terms in $T$, where $s_{i+1}$ is either equal to $s_i$ or arises from $s_i$ in exactly
one β-reduction step. If β-reduction is strongly normalizing on $s_0$, the sequence has to become constant
at some point, i.e. $s_N = s_{N+1} = s_{N+2} = \dots$ for some $N$. This implies that, after the first $N$ rewriting steps,
rule (6) is not used anymore. Define the weight $w(t)$ of an expression $t$ to be 1, if the expression is just an
element of $\mathrm{oT} \otimes \vec{V}$, to be 2, if it is a value of the closure type, to be $1 + 2^n + w(v_1) + w(v_2) + \dots + w(v_n)$,
if it is a value of the form $x\,v_1\,v_2 \dots v_n$ (i.e. a large application) and, if it is a tuple of two expressions, as
the sum of both weights. Then, each of the rewritings that are induced by the first five lines in the table
increases the weight of the expression, so we get $w(t_N) < w(t_{N+1}) < w(t_{N+2}) < \dots$; however, as the total
number of values (and tuples in $\mathrm{oT} \otimes \vec{V}$) is bounded by the length of the term we get after printing $t_N$ (or
any $t_{N+i}$), the sequence is bounded, resulting in the required contradiction.



7 Experiments and Results 

The specified term representation and evaluation have been implemented in Haskell. They have been 
used by a type checker to check large files of dependently typed terms of the Edinburgh Logical Frame- 
work which were kindly provided by Andrew W. Appel (Princeton University). To make this possible, 
an extended syntax has been used that includes Π-types, constants and definitions. It is straightforward
to extend our evaluation algorithm to the extended syntax; for details consult the Bachelor's thesis of
the second author [16]. The substitution lists have been implemented as simple Haskell lists, and also
as balanced binary trees (following Adams [2]) for better asymptotic complexity. Both variants were
evaluated for performance [referred to as Ordered (trees) and Ordered (lists)].

For comparison, the completely analogous algorithm for terms in extended basic syntax (i.e. T) has 
been used [Simple Closures]. Furthermore, we have tested a strategy that always evaluates completely 
(i.e. produces β-normal forms) using Hereditary Substitution [Beta Normal Values] [25].

Our main test file w32_sig_semant.elf with a size of approximately 21 megabytes contains a proof
described in [3]. We also tested smaller parts of this file, more precisely, the first 6000, 10,000 and
12,000 lines without the rest (named 6000.elf and so on). Later terms tend to be larger, so the tests with
fewer lines needed much less time.

All tests were executed on the same server, baerentatze.cip.ifi.lmu.de, working with a CPU
of type AMD Phenom II X4 B95 (only one core used, 3 GHz) and 8 GB system memory. The
measurements of space and time consumption are given in the following tables (rounded average values). More
specifically, time refers to the total time, including parsing the input file and transforming (if necessary)
the representation to our ordered representation or to one that uses de Bruijn terms. In our
tests, the transformation process only took a negligible amount of time. The time value does
not include any printing of the terms or the values (with printing, the total time increased significantly).
Space refers to the peak space usage of the whole process.

6000.elf (file size: 3.8 MB)

                        time (sec)   space (MB)
  Ordered (trees)             18.9         1111
  Ordered (lists)             18.6         1114
  Simple Closures             18.5         1152
  Beta Normal Values          27.6         2034

10000.elf (file size: 12.9 MB)

                        time (sec)   space (MB)
  Ordered (trees)             61.0         3230
  Ordered (lists)             60.6         3237
  Simple Closures             60.0         3302
  Beta Normal Values          98.7         5878

12000.elf (file size: 17.8 MB)

                        time (sec)   space (MB)
  Ordered (trees)             84.3         5096
  Ordered (lists)             83.8         5103
  Simple Closures             83.6         5226
  Beta Normal Values         137.7         8513


Unsurprisingly, beta normal values perform significantly worse than each of the other possibilities.
However, the difference is smaller than might have been expected. This might be due to the fact
that during type checking, total evaluation of a term is often necessary anyway, thereby reducing the
disadvantage of hereditary substitution.

Although none of the other strategies exhibited any shortcomings in the comparisons above, the 
following results for the complete file are remarkable. Here, implementing ordered substitutions as 
normal Haskell lists seems to be much more efficient than using tree structures: 



w32_sig_semant.elf (file size: 20.9 MB)

                        time (sec)   space (MB)
  Ordered (trees)            108.4         8877
  Ordered (lists)             94.8         4948
  Simple Closures             94.3         5068
  Beta Normal Values         169.8         9044



Our Simple Closures are still on the same level as Ordered Representation with lists, but the trees
are far behind. In comparison, the type checker of the Twelf project, Twelf r1697 (written in Standard
ML and compiled with MLton's whole program optimizations [11]), does the job nearly five times faster
while using only 2720 megabytes of memory.

8 Related Work and Conclusions 

Our term representation is inspired by intuitionistic implicational linear logic in natural deduction style
which has explicit operations for weakening and contraction [5]. With explicit weakening and
contraction, one easily maintains complete information about the free variables of a term at each node [15]. Our
term representation incorporates weakening and contraction into lambda abstraction. By using
inspiration from ordered logic, we reduce the stored information at application nodes to a minimum, namely an
integer; further, our variable nodes need to carry no information at all.

Another means to maintain information about free variables are director strings by Sinot [24].
Application nodes come with a map that tells for each variable whether it appears in the left or the right
subterm or in both. Our term representation can be seen as an optimized version of director strings;
however, we have no experimental comparison. Sinot et al. [10] present some performance results of
director strings; however, these are restricted to evaluation of some specific big lambda-terms. There is no
study on their relative performance in a realistic application, which is our concern.

An alternative to explicit substitutions is Nadathur's suspension calculus [19], which, in a refinement,
also maintains information about closedness of subterms. In this refinement, the suspension calculus
maintains at least partial information about the free variables of a subterm. As the basis of an
implementation of λProlog [20], Nadathur has proven the efficiency of his term representation not only for
normalization and equality checking, but also for higher-order unification.

Building on the suspension calculus, Liang, Nadathur, and Qi [17] have evaluated different term
representations in the context of λProlog, a study that compares to our study of term representations for
the Edinburgh Logical Framework. They have tested different combinations of features, confirming our 
result that lazy substitution is preferable to eager substitution [Beta Normal Forms], even more so when 
several substitutions are gathered into one traversal [Closures, Ordered]. They also test a variant where 
each term is equipped with an annotation, a flag telling whether this term is open, i.e., has free variables,
or closed, i.e., has no free variables. In their experimental evaluation, these annotations pay off greatly
for the poorly behaving eager substitution, yet give negligible advantage for explicit substitutions. It 
is hypothesized that in a combined substitution, each subterm will mention at least one variable with a 
high probability, so the traversal has to run over most of the whole term — this is certainly different in the 
substitution for a single variable. 

To summarize, we have presented a new term representation for the lambda-calculus inspired by 
ordered linear logic, and experimentally compared it with well-known representations (closures, normal 
forms) in a prototypical implementation of a type checker for the Edinburgh Logical Framework. The 
experiments were carried out on large realistic proof terms, constructed manually and mechanically. 

The results were not significantly in favor of our new representation. This might be due to the appli- 
cation domain, LF signature checking. For one, LF-definitions are closed, which means that substitutions 
never need to traverse a definition body when the definition is expanded, and this optimization is shared 
by all the term representations we compared. Secondly, we only tested type checking, not type recon- 
struction via unification. During type checking, where equality tests are expected to succeed, full normal 
forms are always computed, and closures are very short-lived in memory. More space leaks are to be 
expected in applications such as logic programming or type reconstruction, where unification is needed, 
which is not expected to always succeed. In constraint-based unification, unsolvable constraints might be 
postponed, keeping closures alive for longer. In such situations, the benefits of our representation might
be more noticeable; more experiments are required.

In the future, we plan to investigate further term representations such as term graphs, and perform 
more experiments. The literature on experimentally successful term representations is sparse; our work
contributes to closing this gap. Our long term goal is to find a term representation which speeds up Agda's
type reconstruction.